Various Criteria of Collocation Cohesion in Internet: Comparison of Resolving Power

Authors

  • Igor A. Bolshakov
  • Elena I. Bolshakova
  • Alexey P. Kotlyarov
  • Alexander F. Gelbukh
Abstract

To extract collocations from the Internet, the cohesion between potential collocates must be estimated numerically. The Mutual Information (MI) cohesion measure, based on the numbers of collocates occurring closely together (N12) and apart (N1, N2), is well known, but Web page statistics deprive MI of its statistical validity. We propose a family of different measures that depend on N1, N2, and N12 in a similarly monotonic way and retain the scalability of MI. We apply the new criteria to a collection of N1, N2, and N12 values obtained from AltaVista for links between a few tens of English nouns and several hundred of their modifiers taken from the Oxford Collocations Dictionary. Pairs of a noun and one of its own adjectives are true collocations, and their measure values form one distribution. Pairs of a noun and an alien adjective are false collocations, and their measure values form another distribution. A discriminating threshold is sought that minimizes the sum of the probabilities of the two possible types of error. The resolving power of a criterion is the minimum of this sum. The best criterion, the one that delivers the smallest of these minima (the minimum minimorum), is identified.
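The threshold search described in the abstract can be illustrated with a minimal sketch. The abstract gives no formula for MI or for the proposed family of measures, so the code below assumes the textbook pointwise mutual information, log2(N12 · N / (N1 · N2)), where the total page count N (`total`) is an assumed parameter. The function names `pmi` and `best_threshold` are hypothetical, and the scores are made-up toy data, not results from the paper. Error probabilities are estimated as empirical fractions.

```python
import math


def pmi(n1, n2, n12, total):
    """Textbook pointwise mutual information (assumed form, not taken from the paper).

    n1, n2 : counts of pages containing each collocate
    n12    : count of pages containing both collocates close together
    total  : assumed total number of indexed pages
    """
    return math.log2(n12 * total / (n1 * n2))


def best_threshold(true_scores, false_scores):
    """Find the threshold minimizing the sum of the two empirical error rates.

    A pair is accepted as a collocation when its score >= threshold.
    Error type 1: a true collocation scores below the threshold (miss).
    Error type 2: a false collocation scores at or above it (false alarm).
    Returns (threshold, minimal error sum); the second value plays the
    role of the "resolving power" in the abstract's sense.
    """
    # Every observed score is a candidate, plus one value above the maximum
    # so that "reject everything" is also considered.
    candidates = sorted(set(true_scores) | set(false_scores))
    candidates.append(candidates[-1] + 1.0)

    best_t, best_err = None, float("inf")
    for t in candidates:
        miss = sum(s < t for s in true_scores) / len(true_scores)
        false_alarm = sum(s >= t for s in false_scores) / len(false_scores)
        err = miss + false_alarm
        if err < best_err:  # strict comparison keeps the lowest tied threshold
            best_t, best_err = t, err
    return best_t, best_err


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    true_scores = [5.0, 6.0, 7.0, 3.0]
    false_scores = [1.0, 2.0, 4.0, 2.5]
    print(best_threshold(true_scores, false_scores))
```

Comparing several criteria in this setup would mean running `best_threshold` once per measure on the same pairs and keeping the measure with the smallest minimal error sum.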

Similar resources

Comparison of Organizational Social Cohesion Model between Telecommunication of Isfahan and Shahid Montazeri Power Plant

The present research has been conducted in two quantitative and qualitative sections. In the qualitative section, and based on Grounded Theory, the organizational social coherence model was presented. In the quantitative section, the data obtained from the questionnaires was analyzed at two levels of descriptive and inferential statistics including structural equations and through the SPSS and ...

On the crack propagation modeling of hydraulic fracturing by a hybridized displacement discontinuity/boundary collocation method

Numerical methods such as boundary element and finite element methods are widely used for the stress analysis in solid mechanics. This study presents boundary element method based on the displacement discontinuity formulation to solve general problems of interaction between hydraulic fracturing and discontinuities. The crack tip element and a higher order boundary displacement collocation techn...

On Detection of Malapropisms by Multistage Collocation Testing

Malapropism is a (real-word) error in a text consisting in unintended replacement of one content word by another existing content word similar in sound but semantically incompatible with the context and thus destructing text cohesion, e.g.: they travel around the word. We present an algorithm of malapropism detection and correction based on evaluating the cohesion. As a measure of semantic comp...

An Approximate Solution of Functionally Graded Timoshenko Beam Using B-Spline Collocation Method

Collocation methods are popular in providing numerical approximations to complicated governing equations owing to their simplicity in implementation. However, point collocation methods have limitations regarding accuracy and have been modified upon with the application of B-spline approximations. The present study reports the stress and deformation behavior of shear deformable functionally grad...

THE COMPARISON OF EFFICIENT RADIAL BASIS FUNCTIONS COLLOCATION METHODS FOR NUMERICAL SOLUTION OF THE PARABOLIC PDE’S

In this paper, we compare meshfree RBF collocation methods applied to a one-dimensional, time-dependent partial differential equation with a compound nonlocal boundary condition.

Publication date: 2008